We are studying the effect of H151, an inhibitor of the cGAS-STING signaling pathway, on Jurkat, a T-ALL model cell line.
The biological experiment showed that H151 causes cell death, which suggests that the pathway is important for the survival of T-ALL cells.
We want to explore the genes that are differentially expressed between the normal condition and the condition in which the pathway is inhibited. We will therefore perform a differential expression analysis (DEA) followed by a pathway enrichment analysis (PEA).
The analysis follows a basic RNA-seq pipeline.
First, let's look at the distribution of the different biotypes in our data:
In the downstream analysis (DEA), we'll be focusing on the top 2 biotypes (protein_coding and lncRNA). Additional filtering will be applied:
MaxCount_threshold = 20 (at least 1 sample must have a read count above this value)
CpmCount_threshold = 0.5 (counts-per-million (CPM) threshold)
MinSample = 3 (number of samples that must pass the CPM threshold)

Now, to understand the global gene expression landscape and to check the quality of our data, we need to perform a dimensionality reduction analysis: principal component analysis (PCA).
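Before moving on to the PCA, the count filter described above can be illustrated with a minimal Python sketch. It is not the code used in this report. The function names, the toy counts, the use of the column sum as library size, the strict `>` comparisons, and the choice to require both criteria at once are all assumptions made for illustration; only the three threshold values come from the text.

```python
def cpm(counts):
    """Convert raw counts (gene -> list of per-sample counts) to counts per million.

    Assumption: the library size of a sample is the sum of its raw counts.
    """
    n_samples = len(next(iter(counts.values())))
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {
        gene: [1e6 * c / lib_sizes[j] if lib_sizes[j] else 0.0
               for j, c in enumerate(row)]
        for gene, row in counts.items()
    }


def filter_genes(counts, max_count_threshold=20, cpm_threshold=0.5, min_samples=3):
    """Keep genes that pass both the raw-count and the CPM criteria.

    Assumption: a gene is kept only if (a) at least one sample has a raw count
    above max_count_threshold AND (b) at least min_samples samples have a CPM
    above cpm_threshold.
    """
    cpms = cpm(counts)
    kept = {}
    for gene, row in counts.items():
        has_high_count = max(row) > max_count_threshold
        n_pass = sum(value > cpm_threshold for value in cpms[gene])
        if has_high_count and n_pass >= min_samples:
            kept[gene] = row
    return kept


# Hypothetical toy data: 4 genes x 4 samples, each library sums to 1,000,000
# so that CPM equals the raw count.
toy_counts = {
    "geneA": [500000, 400000, 600000, 450000],
    "geneB": [25, 0, 0, 0],        # high count in one sample only
    "geneC": [10, 12, 8, 15],      # never exceeds 20 reads
    "geneD": [499965, 599988, 399992, 549985],
}

kept_genes = sorted(filter_genes(toy_counts))
print(kept_genes)
```

In this toy example geneB fails the MinSample criterion and geneC fails the MaxCount criterion, so only geneA and geneD remain.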
The PCA :
So we first need to transform the raw counts.
We'll be using the variance stabilizing transformation (VST) from the DESeq2 package.
This will:
(DESeq2 computes size factors using a median-of-ratios method)
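
The median-of-ratios idea mentioned above can be sketched in a few lines of Python. This is an illustration of the principle only, not DESeq2's implementation, and the function name and toy counts are hypothetical. For each gene, a reference value is taken as the geometric mean of its counts across samples; each sample's size factor is then the median, over genes, of the ratio between that sample's count and the reference. Genes with a zero count in any sample are skipped because their geometric mean is zero.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of per-gene rows, each row a list of raw counts per sample.
    Returns one size factor per sample.
    """
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts:
        if min(row) <= 0:
            continue  # geometric mean would be zero, so the gene is skipped
        # Log of the geometric mean of this gene across samples.
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    # Median of the log ratios per sample, mapped back to the count scale.
    return [math.exp(median(r)) for r in log_ratios]


# Hypothetical toy data: sample 2 was sequenced exactly twice as deeply as sample 1.
toy = [
    [10, 20],
    [100, 200],
    [50, 100],
    [0, 7],  # contains a zero, so it is ignored
]

sf = size_factors(toy)
print(sf)
```

Dividing each sample's counts by its size factor puts the samples on a common scale; in the toy data the second size factor is twice the first, which mirrors the twofold difference in depth.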